AITopics | effect size

Many problems in computational science and engineering become one-to-many after coarse graining, partial observation, or inverse reconstruction: a resolved state may not determine a unique subgrid forcing, a structural descriptor may not determine a unique effective response, and a low-resolution observation may correspond to many plausible high-resolution fields. In such settings, deterministic surrogates may learn a well-defined mathematical object while still missing application-relevant uncertainty. This tutorial develops a self-contained module centered on the conditional-mean barrier: the point at which a squared-loss predictor has reached the conditional mean and the remaining error is irreducible aleatoric variance. We give two diagnostics for locating this barrier, residual-feature orthogonality and the coefficient of determination against its explained-variance ceiling, and prove that adding latent randomness to a squared-loss predictor collapses it back to the conditional mean. Crossing the barrier therefore requires a loss that scores distributions rather than point predictions. We briefly organize common distributional objectives, including negative log-likelihood, moment and observable matching, variational objectives, adversarial divergences, and score matching, by the feature of the conditional law each targets. The emphasis is the boundary itself and a finite-data procedure for recognizing it, rather than a survey of methods beyond it. CPU-based demonstrations on a two-branch law and a two-scale Lorenz-96 closure problem show how the diagnostics distinguish deterministic underfitting from residual distributional variability.
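
Illustrative sketch (not taken from the paper): the conditional-mean barrier described above can be reproduced on a toy two-branch law. The generative model, the polynomial least-squares predictor, and all names below (`sample_two_branch`, `features`, `ceiling`) are assumptions made for this example. It checks the two diagnostics named in the abstract, held-out R² against the explained-variance ceiling and residual-feature orthogonality.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_two_branch(n, branch=1.0, noise=0.1):
    """Toy one-to-many law (assumed): y = x + branch * s + noise, with s = +/-1."""
    x = rng.uniform(-1.0, 1.0, size=n)
    s = rng.choice([-1.0, 1.0], size=n)
    y = x + branch * s + noise * rng.standard_normal(n)
    return x, y

def features(x, degree=3):
    """Polynomial features 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

x_tr, y_tr = sample_two_branch(20000)
x_te, y_te = sample_two_branch(20000)

# Squared-loss fit: ordinary least squares on polynomial features.
coef, *_ = np.linalg.lstsq(features(x_tr), y_tr, rcond=None)
resid = y_te - features(x_te) @ coef

# Diagnostic 1: held-out R^2 versus the explained-variance ceiling
# Var(E[Y|X]) / Var(Y). Here E[Y|X] = x and Var(x) = 1/3 for U(-1, 1).
r2 = 1.0 - resid.var() / y_te.var()
ceiling = (1.0 / 3.0) / (1.0 / 3.0 + 1.0**2 + 0.1**2)

# Diagnostic 2: residuals should be uncorrelated with functions of x,
# including features the model was not fitted on (x**4, x**5).
probe = features(x_te, degree=5)[:, 1:]
ortho = np.array([np.corrcoef(resid, probe[:, j])[0, 1]
                  for j in range(probe.shape[1])])

print("R^2:", r2, "ceiling:", ceiling)
print("max |corr(residual, feature)|:", np.abs(ortho).max())
```

When R² sits at the ceiling and the residual correlations are near zero, a larger deterministic model would not be expected to help in this toy setting: the remaining error is the spread between the two branches, which a point prediction cannot represent.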

artificial intelligence, machine learning, variance, (19 more...)

arXiv.org Machine Learning

2605.28076

Country: North America > United States (0.46)

Genre:

Research Report (1.00)
Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

1a675d804f50509b8e21d0d3ca709d03-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 11:52:26 GMT

artificial intelligence, machine learning, stable diffusion, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

1a675d804f50509b8e21d0d3ca709d03-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 11:52:22 GMT

computational linguistic, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > Canada > Quebec (0.28)

Genre: Research Report > New Finding (0.95)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Sensing and Signal Processing > Image Processing (0.71)

Choosing the Right Regularizer for Applied ML: Simulation Benchmarks of Popular Scikit-learn Regularization Frameworks

Knight, Benjamin S., Bajaj, Ahsaas

arXiv.org Machine Learning | Apr-8-2026

This study surveys the historical development of regularization, tracing its evolution from stepwise regression in the 1960s to recent advancements in formal error control, structured penalties for non-independent features, Bayesian methods, and l0-based regularization (among other techniques). We empirically evaluate the performance of four canonical frameworks -- Ridge, Lasso, ElasticNet, and Post-Lasso OLS -- across 134,400 simulations spanning a 7-dimensional manifold grounded in eight production-grade machine learning models. Our findings demonstrate that for prediction accuracy when the sample-to-feature ratio is sufficient (n/p >= 78), Ridge, Lasso, and ElasticNet are nearly interchangeable. However, we find that Lasso recall is highly fragile under multicollinearity; at high condition numbers (kappa) and low SNR, Lasso recall collapses to 0.18 while ElasticNet maintains 0.93. Consequently, we advise practitioners against using Lasso or Post-Lasso OLS at high kappa with small sample sizes. The analysis concludes with an objective-driven decision guide to assist machine learning engineers in selecting the optimal scikit-learn-supported framework based on observable feature space attributes.
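
Illustrative sketch (not the study's benchmark code): the abstract frames its guidance in terms of observable attributes such as the sample-to-feature ratio n/p and the condition number kappa. The NumPy example below computes both for a synthetic design and shows how an L2 penalty shrinks coefficients on nearly collinear features. The helper names `design_diagnostics` and `ridge_fit`, the synthetic data, and the choice `alpha=1.0` are assumptions for illustration, and Lasso recall is not reproduced here.

```python
import numpy as np

def design_diagnostics(X):
    """Return (n/p, condition number) of the standardized design matrix."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    n, p = Xs.shape
    return n / p, float(np.linalg.cond(Xs))

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients on centered data (alpha=0 gives OLS)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)

rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)

X_ind = rng.standard_normal((n, 2))                            # independent features
X_col = np.column_stack([z, z + 0.01 * rng.standard_normal(n)])  # nearly collinear
y = X_col[:, 0] + X_col[:, 1] + rng.standard_normal(n)

ratio, kappa_ind = design_diagnostics(X_ind)
_, kappa_col = design_diagnostics(X_col)

beta_ols = ridge_fit(X_col, y, alpha=0.0)
beta_ridge = ridge_fit(X_col, y, alpha=1.0)

print("n/p:", ratio)
print("kappa (independent):", kappa_ind, "kappa (collinear):", kappa_col)
print("OLS coefficients:", beta_ols)
print("ridge coefficients:", beta_ridge)
```

On the collinear design, the unpenalized coefficients are poorly determined individually even though their sum is well estimated; the penalty trades a small bias for much lower variance.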

artificial intelligence, lasso, machine learning, (18 more...)

arXiv.org Machine Learning

2604.03541

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

When Stability Fails: Hidden Failure Modes Of LLMS in Data-Constrained Scientific Decision-Making

Riasat, Nazia

arXiv.org Machine Learning | Mar-18-2026

Large language models (LLMs) are increasingly used as decision-support tools in data-constrained scientific workflows, where correctness and validity are critical. However, evaluation practices often emphasize stability or reproducibility across repeated runs. While these properties are desirable, stability alone does not guarantee agreement with statistical ground truth when such references are available. We introduce a controlled behavioral evaluation framework that explicitly separates four dimensions of LLM decision-making: stability, correctness, prompt sensitivity, and output validity under fixed statistical inputs. We evaluate multiple LLMs using a statistical gene prioritization task derived from differential expression analysis across prompt regimes involving strict and relaxed significance thresholds, borderline ranking scenarios, and minor wording variations. Our experiments show that LLMs can exhibit near-perfect run-to-run stability while systematically diverging from statistical ground truth, over-selecting under relaxed thresholds, responding sharply to minor prompt wording changes, or producing syntactically plausible gene identifiers absent from the input table. Although stability reflects robustness across repeated runs, it does not guarantee agreement with statistical ground truth in structured scientific decision tasks. These findings highlight the importance of explicit ground-truth validation and output validity checks when deploying LLMs in automated or semi-automated scientific workflows.
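
Illustrative sketch (the metric definitions are assumptions, not the paper's): one simple way to separate stability, correctness, and output validity is to score repeated selections with set overlap. The gene identifiers, p-values, and runs below are hypothetical placeholders, and the functions `stability`, `correctness`, and `validity` are names introduced only for this example.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap of two collections treated as sets."""
    a, b = set(a), set(b)
    return 1.0 if not a and not b else len(a & b) / len(a | b)

def stability(runs):
    """Mean pairwise overlap between repeated runs."""
    pairs = list(combinations(runs, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def correctness(runs, truth):
    """Mean overlap between each run and the statistical ground truth."""
    return sum(jaccard(r, truth) for r in runs) / len(runs)

def validity(runs, table_ids):
    """Fraction of returned identifiers that exist in the input table."""
    picked = [g for r in runs for g in r]
    return sum(g in table_ids for g in picked) / len(picked)

# Hypothetical input table: identifier -> p-value.
table = {"G1": 0.001, "G2": 0.03, "G3": 0.2, "G4": 0.04, "G5": 0.6}
truth = {g for g, p in table.items() if p < 0.05}

# Hypothetical model output: three identical runs, including an identifier
# ("G9") that is not in the input table.
runs = [["G1", "G3", "G9"]] * 3

print("stability:", stability(runs))
print("correctness:", correctness(runs, truth))
print("validity:", validity(runs, table))
```

The three runs agree perfectly with one another, yet they overlap poorly with the thresholded ground truth and include an identifier that was never in the table, which is the failure pattern the abstract describes.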

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2603.1584

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.46)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

3322a9a72a1707de14badd5e552ff466-Paper-Conference.pdf

Neural Information Processing Systems | Feb-19-2026, 06:21:40 GMT

clash, experiment, probability, (16 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)
Research Report > Strength Medium (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

94e70705efae423efda1088614128d0b-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 09:45:06 GMT

conditioning, effect size, lpcmci, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

A Limitations

Neural Information Processing Systems | Feb-8-2026, 12:06:23 GMT

Consequently, image datasets depicting these groups have limited capacity to fully represent these demographics and intersectional identities. While we recognize that assigning a bias score based on these limited resources might not be entirely accurate, it is a vital first step in the right direction. Moreover, the bias effect size (Eq 8) may sometimes be unreliable [Meade et al., ... As described in Sec. 5, we manually compare our best HardNeg Stable Diffusion with vanilla Stable ... A storefront with 'Hello World' written on it. ... This leaves us with 6 categories and 104 prompts: Category, Prompts; Colours: A brown bird and a blue bear. ... An elephant is behind a tree.

artificial intelligence, machine learning, stable diffusion, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)

1a675d804f50509b8e21d0d3ca709d03-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 12:06:21 GMT

computational linguistic, diffusion model, retrieval, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre: Research Report > New Finding (0.95)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Sensing and Signal Processing > Image Processing (0.91)

High-recall causal discovery for autocorrelated time series with latent confounders

Neural Information Processing Systems | Dec-24-2025, 07:34:09 GMT

We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and variants suffer from low recall in the autocorrelated time series case and identify low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods for the case of autocorrelated continuous variables while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation.
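
Illustrative sketch (a linear-Gaussian toy, not the paper's method): the claim that effect size can increase when causal parents are in the conditioning set can be checked numerically with partial correlations. The two-variable autoregressive model, its coefficients `a` and `c`, and the helper names `simulate` and `partial_corr` are assumptions for this example. It tests the link from X at lag 1 to Y, first conditioning only on X at lag 2, then also on Y at lag 1, which is a parent of Y.

```python
import numpy as np

def simulate(n, a=0.9, c=0.3, burn=500, seed=0):
    """Assumed model: X_t = a X_{t-1} + eta_t, Y_t = a Y_{t-1} + c X_{t-1} + eps_t."""
    rng = np.random.default_rng(seed)
    total = n + burn
    eta = rng.standard_normal(total)
    eps = rng.standard_normal(total)
    x = np.zeros(total)
    y = np.zeros(total)
    for t in range(1, total):
        x[t] = a * x[t - 1] + eta[t]
        y[t] = a * y[t - 1] + c * x[t - 1] + eps[t]
    return x[burn:], y[burn:]

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z (and an intercept)."""
    Z = np.column_stack([np.ones(len(x)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

x, y = simulate(5000)

# Align series so every array refers to the same time index t.
y_t, y_lag1 = y[2:], y[1:-1]
x_lag1, x_lag2 = x[1:-1], x[:-2]

# Effect size of the test for X_{t-1} -> Y_t under two conditioning sets.
es_without = abs(partial_corr(x_lag1, y_t, x_lag2))
es_with = abs(partial_corr(x_lag1, y_t, np.column_stack([x_lag2, y_lag1])))

print("conditioning on X_{t-2} only:", es_without)
print("conditioning on X_{t-2} and the parent Y_{t-1}:", es_with)
```

In this toy model, leaving out Y at lag 1 keeps the autocorrelated part of Y in the residual, which inflates its variance and dilutes the dependence on X; conditioning on the parent removes that variance and the partial correlation rises.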

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Software > Programming Languages (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.40)


The conditional-mean barrier: From deterministic regression to conditional distribution learning